A NOTE ON NON-PARAMETRIC METHODS


By 

I. Richard Savage
National Bureau of Standards

SUMMARY 

The main tools of non-parametric statistics are described.
It is pointed out that some of these tools do not have
natural extensions to multivariate problems. The possibility
of finding tests of randomness that are independent of
ordinary tests of hypotheses is explained.

I INTRODUCTION 

The object of non-parametric methods is to make probability
statements about random variables when "little" is
known of the distribution of these random variables. It is
hard to give the precise meaning of "little" in the above
statement. It refers here to the fact that all we know about
the random variables of interest can be phrased in terms of
such general function theory concepts as continuity of the
distribution functions, the existence of a density function,
the possibility of factoring the distribution functions that
are involved, etc.

It is interesting to note that most of the work that
has been done in a non-parametric manner has been concerned
with real-valued (scalar) random variables. Exceptions to
this are the work done on multidimensional tolerance limits
[2] and rank correlation [5]. We shall point out that this
is a natural limitation of the common tools that have been
used for most of this work, i.e., the use of the probability
integral transform, of topological invariance, of randomization
and of dichotomization.

II PROBABILITY INTEGRAL TRANSFORM

The probability integral transform technique says that
if F(x) is the cumulative distribution function (continuous)
of the random variable X, then the new random variable Y =
F(X) has a uniform distribution on the interval (0, 1). This
fact has been most useful in achieving by non-parametric
methods such results as finding confidence intervals for the
median [15], finding confidence intervals for a distribution
by the Wald-Wolfowitz method [11], developing the Neyman
smooth test [9], etc.
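As an illustration only (it is not part of the original note), the
transform can be checked numerically. The sketch below assumes an
exponential distribution with rate 1 as the known F; the names
exponential_cdf, pit_sample and sup_distance_from_uniform are
hypothetical. It draws a sample, applies Y = F(X), and measures how
far the empirical distribution of Y lies from the uniform
distribution on (0, 1).

```python
import math
import random


def exponential_cdf(x, rate=1.0):
    """F(x) = 1 - exp(-rate * x) for x > 0, and 0 otherwise (assumed example F)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def pit_sample(n, rate=1.0, seed=0):
    """Draw n exponential observations and return Y = F(X) for each."""
    rng = random.Random(seed)
    xs = [rng.expovariate(rate) for _ in range(n)]
    return [exponential_cdf(x, rate) for x in xs]


def sup_distance_from_uniform(ys):
    """Largest gap between the empirical distribution of ys and the line y."""
    ordered = sorted(ys)
    n = len(ordered)
    return max(max((i + 1) / n - y, y - i / n) for i, y in enumerate(ordered))


ys = pit_sample(2000, seed=1)
# A small value is what one expects if Y is uniform on (0, 1).
print(sup_distance_from_uniform(ys))
```

The same check fails, as it should, if the wrong F is used in the
transform, which is the idea behind goodness-of-fit procedures built
on this tool.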

Simpson [10] pointed out that the corresponding transformation
for two-dimensional random variables, and consequently
for higher dimensions, does not have the desired
properties. However, in facing a multivariate non-parametric
problem it is sometimes possible to consider a real-valued
random variable that is a function of the several original
random variables, and in this manner the problem can be reduced
to one that can be handled. This is essentially what Simpson
did in [10].

M. Rosenblatt, in his paper ("Limit theorems associated
with variants of the von Mises statistic," presented at the
March 21, 1952 contributed paper session of the 51st meeting
of the Institute of Mathematical Statistics), has given an
analogue to the probability integral transformation which
should prove useful in future multivariate, non-parametric
studies.

III TOPOLOGICAL INVARIANCE

If points are placed on a line, and then a topological
transformation of the line into itself is made, the order
of the points on the line either remains the same or is reversed.
Many non-parametric procedures depend on this fact, e.g., the
Wald-Wolfowitz test of goodness-of-fit using the total number
of runs [12], the Mann test of randomness [7], the Mann-
Whitney two-sample test [8], etc.
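The invariance can be seen directly in a small computation. The
following sketch is an illustration added here, not taken from the
cited papers: the helper names ranks and precedence_count are
hypothetical, and precedence_count is only one common form of a
two-sample counting statistic, assumed for the example. An increasing
transformation of the line (x cubed) leaves the ranks unchanged, and a
decreasing one (minus x) reverses them.

```python
def ranks(values):
    """Rank 1 is the smallest value; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def precedence_count(xs, ys):
    """Number of pairs (x, y) in which the y observation is below the x."""
    return sum(1 for x in xs for y in ys if y < x)


xs = [0.3, -1.2, 2.5, 0.9]
ys = [0.1, 1.7, -0.4]

increasing = lambda v: v ** 3   # order-preserving map of the line onto itself
decreasing = lambda v: -v       # order-reversing map of the line onto itself

print(ranks(xs))
print(ranks([increasing(x) for x in xs]))   # same ranks as above
print(ranks([decreasing(x) for x in xs]))   # each rank r becomes n + 1 - r

# A statistic that depends on the data only through their order is
# unchanged when both samples undergo the same increasing transformation.
print(precedence_count(xs, ys))
print(precedence_count([increasing(x) for x in xs],
                       [increasing(y) for y in ys]))
```

In two or more dimensions there is no such ordering of the points
for a transformation to preserve, which is why the device does not
carry over.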

Wolfowitz [16] pointed out that for higher dimensions
there was no analogue to this procedure. This is most unfortunate
since it has turned out that this is one of the
most useful of the non-parametric methods.

It is interesting to note that rank order statistics
and symmetric statistics are independent, given a sample
of real-valued continuous random variables that are identically
and  independently  distributed  [U3*  By  a rank  order  statistic 
is meant a statistic which depends only on the order in which
the ranked variables were observed, and thus, using the
topological invariance, its distribution under random sampling
is non-parametric. Examples of such are the longest run
above the median and the statistics used by Mann [7]
to test for trend. By a symmetric statistic is meant a
symmetric function of the sample observations. The median,
the mean, etc., of the sample are symmetric statistics.

The proof of the independence theorem for rank order statistics
and symmetric statistics and its extension to several samples
is simple. The independence of rank order statistics and
symmetric statistics is useful in situations where we wish
to test the hypothesis that we have a random sample from a
population with some specific property; for example, the
hypothesis that we have a random sample from a population with
median equal to zero. Ordinarily in this type of testing
problem all of the emphasis is put on testing whether the
median is equal to zero; but in many situations it would
also be important to test whether the sample was random.
This method allows us to test both parts of the hypothesis
with tests which are independent of each other when the null
hypothesis is true. Thus it is possible to get known significance
levels [14].
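One way to see why the independence holds is sketched below. It is
an added illustration under stated assumptions, not the author's
proof. Given the unordered set of observed values, every ordering is
equally likely under random sampling from a continuous population; a
symmetric statistic is fixed by the set, while a rank order statistic
has the same conditional distribution whatever the set. The statistic
positive_difference_count (the number of positive successive
differences) and the helper conditional_distribution are hypothetical
names chosen for the example.

```python
from collections import Counter
from itertools import permutations


def positive_difference_count(sequence):
    """Number of successive differences that are positive (order only)."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if b > a)


def conditional_distribution(values, statistic):
    """Counts of the statistic over all equally likely orderings of values."""
    return Counter(statistic(p) for p in permutations(values))


# Two very different sets of values, hence different means and medians.
first = conditional_distribution([0.2, 1.5, -3.0, 7.1],
                                 positive_difference_count)
second = conditional_distribution([10.0, 11.0, 12.5, 400.0],
                                  positive_difference_count)

# The conditional distribution of the rank order statistic does not
# depend on the values, so it carries no information about any
# symmetric statistic computed from them.
print(first == second)
print(sorted(first.items()))
```

Because the two tests are independent under the null hypothesis,
their significance levels can be combined by the usual rules for
independent events.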

IV  RANDOMIZATION 

The method of randomization is based on the fact that
under the assumption of random sampling we can find the exact
conditional distribution for any statistic once we have made
the set of observations [12].

In this method there is no problem in extending the
technique to the multivariate case. Here the main problem
consists in the excessive computation required. Thus we
would have to make a separate table giving the distribution
for each sample drawn. It does not seem possible to give
a simple technique for making exact probability statements
using this method, although the asymptotic theory has been
worked out in detail [3] for the univariate case.
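A minimal sketch of the method for two small samples follows. It is
an added example, assuming the absolute difference of sample means as
the statistic; the function name exact_randomization_p_value is
hypothetical. Every way of splitting the pooled observations into
groups of the observed sizes is enumerated, which is exactly the
separate table mentioned above.

```python
from fractions import Fraction
from itertools import combinations


def exact_randomization_p_value(xs, ys):
    """Exact conditional p-value for |mean(xs) - mean(ys)| given the pooled data."""
    pooled = list(xs) + list(ys)
    n, m = len(pooled), len(xs)
    total = sum(pooled)
    observed = abs(sum(xs) / m - sum(ys) / (n - m))
    extreme = 0
    splits = 0
    for chosen in combinations(range(n), m):
        first_sum = sum(pooled[i] for i in chosen)
        difference = abs(first_sum / m - (total - first_sum) / (n - m))
        # Small tolerance so that ties with the observed value are counted.
        if difference >= observed - 1e-12:
            extreme += 1
        splits += 1
    return Fraction(extreme, splits)


# Only the two most lopsided of the 20 possible splits are as extreme
# as the observed one.
print(exact_randomization_p_value([1, 2, 3], [4, 5, 6]))
```

The enumeration grows quickly: two samples of ten observations each
already give 184,756 splits, and the table must be recomputed for
every new set of data.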

V DICHOTOMIZATION

Many problems where non-parametric methods are called
for can be so transformed that all of the variables involved
have binomial distributions. Typical of such problems is
the testing of whether the median of a continuous distribution
is equal to zero. Here one classifies the observations as positive
or negative; then one proceeds by using the fact that the
number of positive and the number of negative observations
have a binomial distribution.
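The median example can be sketched in a few lines. This is an added
illustration: the function name sign_test_p_value is hypothetical, and
doubling the smaller tail to obtain a two-sided level is an assumed
convention. Under the hypothesis that the median is zero, each
observation is positive with probability one half, so the number of
positive observations is binomial with parameters n and 1/2.

```python
from math import comb


def sign_test_p_value(observations):
    """Two-sided binomial p-value for the hypothesis that the median is zero."""
    positives = sum(1 for x in observations if x > 0)
    negatives = sum(1 for x in observations if x < 0)
    n = positives + negatives          # exact zeros are dropped
    smaller = min(positives, negatives)
    # P(count <= smaller) for a binomial(n, 1/2) variable.
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


sample = [1.2, 0.4, 2.2, -0.3, 0.9, 1.7, 0.5, 3.1]
# Seven positive and one negative observation out of eight.
print(sign_test_p_value(sample))
```

Only the signs enter the calculation; the sizes of the observations
are discarded, which is the source of the loss of efficiency noted
for this method.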

By changing one's problems so that they only involve
binomial variates one quickly has the required distribution
theory. This method has been used by Lehmann [6] for attacking
multivariate problems. Perhaps the biggest drawback in
using this method is that for some problems it appears to be
very inefficient [6].


1. Bateman, G. On the power function of the longest run as
a test for randomness in a sequence of alternatives.
Biometrika, 35 (1948).

2. Fraser, D. A. S. Sequentially determined statistically
equivalent blocks. American Mathematical Society,
22 (1951), p. 372.

3. Hoeffding, W. A combinatorial central limit theorem.
Annals of Mathematical Statistics, 22 (1951), p. 558.

4. Hotelling, H. and Pabst, M. R. Rank correlation and
tests of significance involving no assumption of
normality. Annals of Mathematical Statistics, 7
(1936), p. 29.

5. Kendall, M. G. Rank Correlation Methods. Charles Griffin
and Company Limited, London, 1948.

6. Lehmann, E. Consistency and unbiasedness of certain
non-parametric tests. Annals of Mathematical Statistics,
22 (1951), p. 165.

7. Mann, H. B. On a test for randomness based on signs of
differences. Annals of Mathematical Statistics, 16
(1945), p. 193.

8. Mann, H. B. and Whitney, D. R. On a test of whether one
of two random variables is stochastically larger
than the other. Annals of Mathematical Statistics,
18 (1947), p. 50.

9. Neyman, J. "Smooth Test" for goodness of fit. Skandinavisk
Aktuarietidskrift, (1937), p. 149.

10. Simpson, P. Note on the estimation of a bivariate
distribution function. Annals of Mathematical Statistics,
22 (1951), p. 476.

11. Wald, A. and Wolfowitz, J. Confidence limits for continuous
distribution functions. Annals of Mathematical
Statistics, 10 (1939), p. 105.

12. Wald, A. and Wolfowitz, J. On a test whether two samples
are from the same population. Annals of Mathematical
Statistics, 11 (1940), p. 147.


13. Wald, A. and Wolfowitz, J. Statistical tests based on
permutations of the observations. Annals of Mathematical
Statistics, 15 (1944), No. 4.

14. Wallis, W. A. Compounding probabilities from independent
significance tests. Econometrica, 10 (1942), No. 3.

15. Wilks, S. S. Order statistics. Bulletin of the American
Mathematical Society, 54 (1948), p. 6.

16. Wolfowitz, J. Additive partition functions and a class
of statistical hypotheses. Annals of Mathematical
Statistics, 13 (1942), p. 247.


June 3, 1952

